ON THE MEASUREMENT OF CORRELATION WITH
SPECIAL REFERENCE TO SOME CHARACTERS
OF INDIAN CORN

BY HENRY L. RIETZ, Statistician, and LOUIE H. SMITH, Assistant
Chief, Plant Breeding

INTRODUCTION 

In  Bulletin  119  of  this  station,  there  are  presented  methods  of 
dealing  with  problems  involving  variability  of  a  single  character 
and  these  methods  are  there  applied  to  the  study  of  type  and 
variability  of  some  characters  in  corn. 

But the breeder deals with many characters in the same organism,
each with its own variability. After treating separate characters,
what he needs next to know is whether, and to what extent,
any bond may exist between characters by virtue of which, if one
character varies, other characters of the same organism tend also
to move in the same or in opposite directions.

If  such  a  bond  exists  the  characters  are  said  to  be  correlated 
(co-related),  and  it  is  the  purpose  of  this  bulletin  to  describe 
methods  by  which  such  a  correlation  may  be  detected  if  present 
and  the  strength  of  its  bond  be  measured. 

A  second  purpose  of  the  bulletin  is  to  present  data  concerning 
certain  definite  correlations  for  corn  bred  at  the  Illinois  Station. 

The great value to the breeder of definite knowledge of correlations
within a species is that it gives reliable information, enabling
him to predict from the presence of certain characters the
most probable values of associated characters.

More technically speaking, when we are dealing with two systems
of variable characters in correspondence, we are, in general,
much concerned about whether fluctuations of variates in one system
are in sympathy with fluctuations of corresponding variates
in the other, and with establishing causal relations between the
two series of phenomena. For example, in breeding corn for commercial
purposes, we are much interested in knowing what characters
of the seed ears should be modified or selected to increase
the yield; and, if we should select directly one character, it is
important to know to what extent other characters are being selected
indirectly, because of the tendency of the two to fluctuate in
the same or in opposite directions.
These examples illustrate the following technical definition of
correlation: Two characters, say length and circumference of
ears of corn, are said to be correlated when, with any selected
values (x) of the one character, we find that values of the other
character, a given amount above and below the mean of that character,
are not equally likely to be associated.

As  the  first  and  simplest  method,  it  may  possibly  occur  that 
correlation  is  so  pronounced  that  it  may  be  necessary  merely  to 
look  at  two  sets  of  figures  to  note  that  corresponding  values  have 
a  tendency  to  change  simultaneously  in  the  same  or  in  opposite 
directions. The existence of such decided correlation may be
known by inspection.

As  a  second,  and  somewhat  more  effective  method,  one  may  plot 
curves  for  each  of  two  systems  of  variates,  and  if  correlation  is 
very  pronounced,  it  may  sometimes  be  discovered  by  noting 
whether  the  curves  have  a  tendency  to  rise  and  fall  together,  or 
if,  when  one  rises,  the  other  falls. 

Not only do these methods prove inadequate to detect correlation
unless it is exceedingly pronounced, but they lack precision in
that they do not give a measure of correlation. It is not enough
to know whether correlation exists; its quantitative measure is
usually a matter of importance.

Our power to measure the correlations among associated phenomena
has been enormously increased during the past two decades
by methods introduced by Galton and developed by Pearson
and those associated with him. An application of these methods
has not until very recently been made to problems in agriculture.*

It is the purpose of this Bulletin to present in a form useful to
agricultural students the methods of correlation measurement without
presuming more mathematics than is absolutely necessary, and
to give the results of our investigations into the correlation of
certain characters in corn bred at the Illinois Station.

1. The Correlation Table. The first step in the process of
measuring correlation is to construct a double-entry table (Fig. 1),
called a "correlation table", out of the measurements of the
characters in a large number of individuals. One mark at the
intersection of the proper column and row in the table records a
pair of corresponding variates with reference to two characters.

Put in tabular form as it appears in the actual work, we have
the following (Fig. 1) for the correlation table between the number
of rows of kernels on ears and the circumference of the ears
for a certain plat of corn (Plot 401) grown in 1907 at the Illinois
Experiment Station.

*Davenport, Principles of Breeding, pp. 452-472, 703-711.
Pearl and Surface, Bulletin 166, Maine Agr. Exp. Sta.
Clark, Bulletin 279, Cornell University Agr. Exp. Sta.

It  will  be  observed  that  this  table  consists  of  a  double  system 
of  arrays  each  of  which  is  a  frequency  distribution  as  explained 
in Bul. 119, and has its own mean and standard deviation as has
any  other  frequency  distribution. 

To show, in a concrete way, how such a table is made, suppose
an ear of corn has 18 rows of kernels and a circumference between
6.375 and 6.625 in. A mark is then made in the rectangle at the
intersection of the column headed 18 and the row of the table
marked 6.50. A second table (Fig. 2) exhibits the result of
counting the marks in each of the rectangles of Fig. 1.

Any  number  in  this  table,  say  43,  in  the  column  headed  18 
and  the  row  marked  6.50,  indicates  that  43  ears  of  the  total  of 
769  ears  had  18  rows  of  kernels  and  a  circumference  of  class 
mark  6.50. 

By adding the numbers in horizontal arrays, we obtain the
frequency distribution of the population with respect to circumference
of ears, and by adding the numbers in columns, we obtain
the frequency distribution of the population with respect to
rows of kernels on ears (See Fig. 3).
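The tallying described above can be sketched in a few lines of Python. This is only an illustration: the ears listed are hypothetical measurements, not the Plot 401 data, and the helper names (`class_mark`, `correlation_table`, `marginal_totals`) are introduced here for the example.

```python
from collections import Counter

def class_mark(value, width=0.25):
    """Return the nearest class mark for a measurement (classes `width` wide).

    Exact class boundaries are ambiguous in any grouping scheme; here they
    simply follow Python's rounding rule.
    """
    return round(value / width) * width

def correlation_table(pairs):
    """Count the individuals falling in each (row class, column class) cell."""
    return Counter(pairs)

def marginal_totals(table):
    """Add cells along rows and along columns to get the two distributions."""
    row_totals, column_totals = Counter(), Counter()
    for (row_class, column_class), count in table.items():
        row_totals[row_class] += count
        column_totals[column_class] += count
    return row_totals, column_totals

# Hypothetical ears: (circumference in inches, number of rows of kernels).
measurements = [(6.40, 18), (6.55, 18), (6.05, 16),
                (6.80, 18), (5.95, 16), (7.00, 20)]
ears = [(class_mark(c), rows) for c, rows in measurements]

table = correlation_table(ears)
by_circumference, by_rows = marginal_totals(table)
```

Each key of `table` is one rectangle of the correlation table, and the two marginal counters correspond to the frequency distributions obtained by adding along rows and along columns.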

The mere superficial inspection of a correlation table may suggest
that a certain amount of correlation exists. For example,
ears of corn of circumference 7.50 inches, from this population,
are much more likely to have 20 rows of kernels than are ears of
circumference 6 inches. It is pretty clear that there is a tendency,
in general, for the marks in the table of Fig. 1 to arrange themselves
in a region along the diagonal from the upper left hand corner
to the lower right hand corner of the table. This signifies that
a positive correlation exists; that is, in general, for this population,
ears that have a large number of rows of kernels are more
likely to be large in circumference than are ears with a smaller
number of rows of kernels. But it is not our purpose merely to
detect the existence of correlation. What we seek is a statistical
coefficient that will serve to measure correlation, and that will enable
us to predict with as high a degree of probability as possible,
from an assigned character, the value of the associated character
in the related system of variates. The coefficient of correlation,
denoted by r in this paper, is useful for this purpose.

2. Nature of the coefficient r. A discussion of the mathematical
theory of correlation will be given in the Appendix to this
Bulletin, but the general character and common sense significance
of r may well be stated here. The value of the coefficient is within
the limits -1 and +1. If r = 1, there is said to be perfect positive
correlation; that is, for any assigned value of the character in one
system, the value for each corresponding individual of the related
system is known, and the ratio of the deviations of any two variates
of a pair from their mean values is a constant for all pairs.
In other words, perfect correlation (r = 1) indicates complete causation
in the sense that the two characters go together perfectly.
If r = -1, there is said to be perfect negative correlation. In this
case, the ratio of the deviations from mean values is negative and
constant for all pairs. If no correlation exists, the two characters
appear indifferent to each other, and this fact is expressed by r = 0.
In a general way, we may say that the correlation should be judged,
in any application, by the value that r takes between -1 and +1.
For our applications to characters in corn, there is usually a positive
correlation, and the amount of correlation is measured by the
value of r between 0 and 1.

The correlation coefficient may be defined as the mean product
of deviations of corresponding variates from their mean values in
units of the standard deviations.

The meaning of the standard deviation of a frequency distribution
is shown in Bulletin 119 of this Station. If a variate is
below the mean, its deviation is negative; while if it is above the
mean, it is positive. Hence, if each individual of a pair of variates
is above the mean of the system to which it belongs, or if each of
the pair is below, the pair tends to contribute to positive correlation.
On the other hand, if one variate of a pair is below the mean
of its system and the other above, the product is negative, and
such a pair tends to contribute to negative correlation. While it
appears from this that the coefficient of correlation, as defined
above, has a common sense justification, we shall require the mathematical
methods of the Appendix to see more fully how this coefficient
with the standard deviations of the two systems of variates
are descriptive of the correlated population exhibited on a
correlation table such as is shown in Fig. 1.

3. Details of the computation of r. In algebraic form

$$r = \frac{\Sigma xy}{n\,\sigma_x\,\sigma_y} \qquad (1)$$

where $\Sigma xy$ means the sum of the products of the deviations of
corresponding variates from their mean values; and $\sigma_x$, $\sigma_y$ are
standard deviations, while n is the number of pairs of variates.
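Formula (1) can be written out directly as a short Python function. This is a minimal sketch: the function name and the sample values are chosen for illustration and are not taken from the bulletin's data.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Formula (1): sum of products of deviations over n * sigma_x * sigma_y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]          # deviations from the mean
    dev_y = [y - mean_y for y in ys]
    sigma_x = sqrt(sum(d * d for d in dev_x) / n)
    sigma_y = sqrt(sum(d * d for d in dev_y) / n)
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    return sum_xy / (n * sigma_x * sigma_y)

# Hypothetical pairs: circumference (inches) and rows of kernels.
circumferences = [6.00, 6.50, 6.75, 7.00, 6.25]
rows = [16, 18, 18, 20, 16]
r = correlation_coefficient(circumferences, rows)
```

Pairs lying exactly on a rising straight line give r = 1, and pairs on a falling line give r = -1, which agrees with the limits stated in section 2.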

As we shall use the correlation table of Fig. 2 to illustrate a
systematic arrangement of the work in the computation of r, the
formula (1) may appear more significant in the application by
writing it in the form

$$r = \frac{\Sigma D_c D_R}{n\,\sigma_c\,\sigma_R} \qquad (2)$$

where the subscripts c and R refer to circumference and rows of
kernels respectively, while the D's represent deviations of characters
indicated by subscripts. That is to say, $D_c$ is the deviation
of the circumference of an ear from the mean circumference, and
$D_R$ the deviation of the number of rows of kernels from the mean
of the number of rows of kernels.

There is derived in the Appendix, pp. 313-314, a formula which
gives the same numerical value as

$$\frac{\Sigma xy}{n} \quad\text{or}\quad \frac{\Sigma D_c D_R}{n}$$

and, while its algebraic expression is a little more complicated, it
is much better adapted to numerical computation than the above
formula, as it avoids the use of decimals until almost the end of
the work. In this respect, it is analogous to the shorter method
presented in Bulletin 119 for finding the mean and the standard
deviation. If applied to the case of the number of rows of kernels
and circumference of ears of corn, the formula is

$$r = \frac{\dfrac{\Sigma D_R' D_c'}{n} - C_R C_c}{\sigma_R\,\sigma_c} \qquad (3)$$

where $D_R'$, $D_c'$ are deviations from our guesses at the means instead
of deviations from the means themselves; and $C_R$, $C_c$ are the corrections
applied to the guesses at the mean number of rows of
kernels and circumference respectively in finding the means and
standard deviations.
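A short Python sketch may help show why formula (3) gives the same result whatever guesses are used. The function name and the sample ears are hypothetical; the standard deviations are computed about the guesses and corrected in the same way, which is assumed here to match the shorter method of Bulletin 119.

```python
from math import sqrt

def correlation_from_guesses(xs, ys, guess_x, guess_y):
    """Formula (3): work with deviations from guessed means, then correct."""
    n = len(xs)
    dev_x = [x - guess_x for x in xs]         # D' for the first character
    dev_y = [y - guess_y for y in ys]         # D' for the second character
    corr_x = sum(dev_x) / n                   # correction: true mean minus guess
    corr_y = sum(dev_y) / n
    mean_product = sum(a * b for a, b in zip(dev_x, dev_y)) / n - corr_x * corr_y
    sigma_x = sqrt(sum(a * a for a in dev_x) / n - corr_x ** 2)
    sigma_y = sqrt(sum(b * b for b in dev_y) / n - corr_y ** 2)
    return mean_product / (sigma_x * sigma_y)

# Hypothetical ears: circumference (inches) and rows of kernels.
circumferences = [6.00, 6.50, 6.75, 7.00, 6.25]
rows = [16, 18, 18, 20, 16]

# Two different pairs of guesses lead to the same coefficient.
r_first = correlation_from_guesses(circumferences, rows, 6.50, 18)
r_second = correlation_from_guesses(circumferences, rows, 6.00, 16)
```

Choosing guesses that coincide with class marks keeps the deviations simple, which is the practical advantage claimed for the formula.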

In the actual work of calculating r from formula (3), it is
highly important to have a systematic form in which to arrange
the work, in order to avoid confusion in the somewhat complicated
details. It seems desirable, for this reason, to describe the
arrangement of the actual work as shown in Fig. 3.

Having given the correlation table, we first add the numbers
in the arrays with respect to both characters; that is, add numbers
in rows and columns of the table. This gives two frequency distributions:
the one with respect to circumference exhibited in the
vertical column headed $f_c$, and the other with respect to rows of
kernels shown in the horizontal line of figures marked $f_R$. For
each of these frequency distributions, the means and standard deviations
are calculated by the shorter method explained and applied
in Bulletin 119.
The results are

$$M_c = 6.597, \quad M_R = 17.264, \quad \sigma_c = 0.527, \quad \sigma_R = 2.396,$$

where $M_c$, $M_R$ represent mean circumference and mean number of
rows of kernels respectively, while $\sigma_c$, $\sigma_R$ are corresponding standard
deviations.

The columns of figures marked $f_c$, $D_c'$, $f_c D_c'$, $f_c D_c'^2$, $f_R$, $D_R'$,
$f_R D_R'$, $f_R D_R'^2$ are all self explanatory to one familiar with the
meaning of algebraic symbols, and who knows how to find the
mean and standard deviation.




There remains the column of figures headed $\Sigma D_R' D_c'$, which
we shall endeavor to explain in detail, as this is the only part of
the computation that is actually new to one who knows how to
calculate the variability of a population. In finding the means, we
get the deviations of class marks from our guesses at mean circumference
and mean number of rows of kernels on ears of corn.
These deviations are marked $D_c'$, $D_R'$. For example, in row 1,
we find 3 ears of circumference 5 inches. These three ears deviate
-1.50 from our guesses at the mean circumference.

We next form the product of each number of the correlation
table and of the two corresponding deviations. For example,
where the column headed 16 and the row labelled 6.00 inches
intersect, occurs the number 34. These 34 ears have deviations
from our guesses of -0.50 and -2 as is indicated by the symbols
$D_c'$ and $D_R'$. Hence, for this number 34, we form the product
34(-0.50)(-2) = +34. Without regard to labor, we should find
such a product for each compartment of the correlation table. The
sum of these products, with due regard to signs, is the $\Sigma D_R' D_c'$ of
formula (3). The systematic way to carry out this work is to
record the results of this operation for each horizontal array in
line with the array under the heading $\Sigma D_R' D_c'$, and then to add the
results for separate arrays to obtain 432.00, which is symbolically
indicated by $\Sigma D_R' D_c'$.

To illustrate the method of calculation, let us take the array
of circumference 6.00 inches as an example. We have, for this
array,

$$\left[(-6)(7) + (-4)(23) + (-2)(34) + (0)(24) + (2)(3)\right] \times (-0.50)$$

This gives 98.00.

For the array of mark 6.75

$$\left[(-6)(1) + (-4)(8) + (-2)(42) + (0)(50) + (2)(18) + (4)(5) + (6)(2)\right] \times (0.25)$$

This gives -13.50.
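The two array sums just worked out can be rechecked with a small Python helper. The frequencies and deviations are those quoted in the text for the arrays of 6.00 and 6.75 inches; the function name is introduced here for illustration only.

```python
def array_product_sum(row_deviations_and_counts, circumference_deviation):
    """Sum D_R' * frequency over one horizontal array, then multiply by D_c'."""
    inner = sum(dev * count for dev, count in row_deviations_and_counts)
    return circumference_deviation * inner

# (deviation in rows of kernels from the guess, number of ears in the cell)
array_6_00 = [(-6, 7), (-4, 23), (-2, 34), (0, 24), (2, 3)]
array_6_75 = [(-6, 1), (-4, 8), (-2, 42), (0, 50), (2, 18), (4, 5), (6, 2)]

sum_6_00 = array_product_sum(array_6_00, -0.50)   # 98.0
sum_6_75 = array_product_sum(array_6_75, 0.25)    # -13.5
```

Repeating the same step for every horizontal array and adding the results gives the total entered under the heading for the sum of products.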
Treat all arrays in this manner, and divide the sum of the
products thus obtained (that is, 432.00) by the number of variates,
769. This gives

$$\frac{\Sigma D_R' D_c'}{n}$$

of formula (3) and equals 0.5618.

Next, we subtract from this the product of our two corrections
in finding means. That is, $C_R C_c = -0.0715$.
To subtract this negative number, we must add 0.0715. This
gives 0.6333 for the numerical value of

$$\frac{\Sigma D_c D_R}{n} = \frac{\Sigma D_R' D_c'}{n} - C_c C_R .$$

Hence

$$r = \frac{0.6333}{(0.527)(2.396)} = 0.501.$$

[Figure: Correlation of Circumference and Rows. Number of Rows of Kernels. Crop 1907, Plot 401.]

Fig. 4
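The closing arithmetic of section 3 can be retraced in Python from the figures quoted in the text (432.00, 769, -0.0715, 0.527 and 2.396). This is only a check on the arithmetic; because these inputs are already rounded, the last digit of the result may differ slightly from the value 0.501 reported in the bulletin.

```python
# Figures quoted in the text for Plot 401, crop of 1907.
sum_of_products = 432.00        # sum of D_R' * D_c' over the whole table
n = 769                         # number of ears
correction_product = -0.0715    # C_R * C_c
sigma_c = 0.527                 # standard deviation of circumference, inches
sigma_R = 2.396                 # standard deviation of rows of kernels

mean_product_about_guesses = sum_of_products / n             # about 0.5618
mean_product = mean_product_about_guesses - correction_product  # about 0.6333
r = mean_product / (sigma_c * sigma_R)                       # about 0.50
```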
4. Modification of the method of computing r (Fig. 4). The
computation of r may, in general, be further simplified by taking
the difference between two successive class marks as the unit
of measurement. That is, the units of grouping are the units
throughout. Then, as shown in Fig. 4, the deviations from the
guesses are consecutive integers. This method gives precisely the
same value for the correlation coefficient as that explained under
Fig. 3. But the standard deviations and the corrections to guesses
at the means are expressed in terms of differences between consecutive
classes as units. When thus expressed, we note on Fig. 4, that

$$\sigma_c' = 2.108,$$

$$\sigma_R' = 1.198,$$

where $\sigma_c'$ is the standard deviation in circumference (expressed in
units equal to the difference between classes) and $\sigma_R'$ is the standard
deviation in rows of kernels (similarly expressed). To express
$\sigma_c'$ in inches, we must multiply by 0.25. This gives $\sigma_c = 0.527$
as before.

Similarly, to make $\sigma_R'$ consistent with $\sigma_R$ of Fig. 3, we must
multiply by 2. This gives $\sigma_R = 2.396$.
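The following Python sketch illustrates the two points of this section: the conversion of the standard deviations from class units back to natural units, and the fact that r is unchanged when class units are used. The ears in the second half are hypothetical, and `pearson_r` is a helper written for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Mean product of deviations divided by the two standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum_xy / (n * sigma_x * sigma_y)

# Conversion of the values read from Fig. 4 (class units) to natural units.
sigma_c = 2.108 * 0.25   # class width of 0.25 inch
sigma_R = 1.198 * 2      # class width of 2 rows

# Hypothetical ears, measured in natural units and in class units.
circumferences = [6.00, 6.50, 6.75, 7.00, 6.25]
rows = [16, 18, 18, 20, 16]
circ_in_classes = [(c - 6.50) / 0.25 for c in circumferences]
rows_in_classes = [(k - 18) / 2 for k in rows]

r_natural = pearson_r(circumferences, rows)
r_classes = pearson_r(circ_in_classes, rows_in_classes)  # same value
```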


5. Probable error. It still remains to find the probable error
in our computed value. The general meaning of the probable error,
and its use in indicating the degree of confidence to be placed
in a result obtained from a random sample of a population has
been given in Bulletin 119. It seems sufficient here to give merely
the formula for the probable error in the coefficient of correlation
r. This formula is

$$E_r = \frac{0.6745\,(1 - r^2)}{\sqrt{n}}$$

Applied to our example,

$$E_r = \frac{0.6745\,\left[1 - (0.501)^2\right]}{\sqrt{769}} = 0.018$$

Hence, we write

r = 0.501 ± 0.018

as the measure of the correlation in question.
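The probable error of r can be evaluated with a one-line Python function. The sketch assumes the usual form of the formula, 0.6745(1 - r²)/√n, and uses the values r = 0.501 and n = 769 from the example; the function name is introduced for illustration.

```python
from math import sqrt

def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient from n pairs of variates."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

error = probable_error_of_r(0.501, 769)   # about 0.018
```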

6. Use of the correlation coefficient. The general use of such
a precise measure of correlation as is given by r, has perhaps been
sufficiently discussed in the introduction. However, it seems well
to emphasize here that in the selection of one character, we, in
general, indirectly select correlated characters, and change their
means and standard deviations accordingly. Again, if in
the selection of what the breeder conceives to be the most
desirable type for parents, he artificially increases or decreases
correlations between characters, this process will, in general,
change the correlation between these characters in parents
and in offspring. For example, it appears that the correlation
between length of ears and circumference for different
types of corn which we have examined varies between the limits
0.128 and 0.623. Now, we have examined cases in connection with
this work where 48 parents are selected so as to exhibit a negative
correlation of -0.21 between length and circumference. If we
use the offspring of such a set of 48 ears and ask to what extent
length and circumference are inherited, or if we ask to what extent
these characters in the parent are correlated with the yield, it
is pretty clear that we have much complicated our problem by the
selection and imposition of the correlation -0.21. Hence, it is
highly desirable to know what correlations actually exist among
different characters, before we should expect to obtain in even an
approximately precise way the correlation between parent and offspring
(inheritance) or the correlation between characters in the
parent and yield.

7. The regression coefficient. From the correlation coefficient
and the standard deviations of each of the two characters, it is easy
to obtain what is known as the regression coefficient. For example,
to obtain the regression coefficient of circumference relative
to the number of rows of kernels, multiply the coefficient of
correlation by the standard deviation in circumference and divide
the product by the standard deviation in the number of rows of
kernels. This gives, for the particular example above,

$$r\,\frac{\sigma_c}{\sigma_R} = 0.110.$$

Similarly, the regression of the number of rows of kernels relative
to the circumference of ears is

$$r\,\frac{\sigma_R}{\sigma_c} = \frac{(0.501)(2.396)}{0.527} = 2.278.$$

8. Use of the regression coefficient. In many systems of correlated
variates, the regression is of a kind described in the Appendix
as linear regression. In such cases, the regression coefficient
gives us a useful method of predicting, from a given value
of one character, the most probable value of the corresponding correlated
character. That is to say, from the selected value of one
character, we calculate the mean value of the corresponding array.
For the particular case in hand, suppose we select ears with twenty
rows of kernels. Such ears deviate 2.736 above the mean number
of rows of kernels on ears, and the regression coefficient (0.110) of
circumference on number of rows of kernels indicates that we should
expect the mean circumference of ears of 20 rows of kernels to be
(0.110)(2.736) = 0.301 inches above the mean circumference for
the entire population. That is, ears with 20 rows should have a
mean circumference

6.597 + 0.301 = 6.898.

By actual computation, the mean of the array of class mark
20 rows is 6.927 ± 0.121, so that the regression coefficient gives
the mean of the array to within deviations due to random sampling.

To be more general, if we select an ear of any deviation x in
the number of rows of kernels from the mean, we should expect
the deviations in circumference of corresponding ears to center
about 0.110x.

Similarly, if we select an ear of any deviation y in circumference
from the mean circumference, we should expect the deviation
in number of rows of kernels to center about

2.278y.
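The prediction for ears of 20 rows can be retraced in Python. The sketch uses the values quoted in the text (r = 0.501, standard deviations 0.527 and 2.396, mean circumference 6.597, deviation 2.736); the function names are introduced for illustration, and linear regression is assumed as in the text.

```python
def regression_coefficient(r, sigma_dependent, sigma_independent):
    """Regression of the dependent character on the independent one."""
    return r * sigma_dependent / sigma_independent

def predicted_mean(mean_dependent, coefficient, deviation_independent):
    """Expected mean of the array selected by a deviation of the other character."""
    return mean_dependent + coefficient * deviation_independent

r = 0.501
sigma_c, sigma_R = 0.527, 2.396   # circumference (inches), rows of kernels
mean_circumference = 6.597

# Regression of circumference on number of rows of kernels.
b_c_on_R = regression_coefficient(r, sigma_c, sigma_R)        # about 0.110

# Ears with 20 rows deviate 2.736 rows above the mean number of rows.
expected_circumference = predicted_mean(mean_circumference, b_c_on_R, 2.736)
```

The same `regression_coefficient` function serves for the opposite direction by exchanging the two standard deviations.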

In beginning the discussion of the use of the regression coefficient,
we limited our remarks to linear regression. This means
that if the correlation table is constructed to scale, and the mean
values of arrays be plotted, these mean values will lie along a
straight line to within deviations to be attributed to random sampling.
Fortunately, this condition is, in general, well satisfied in
our applications.

It  is  important  in  every  case  to  examine  the  correlation  table 
to  ascertain  whether  the  means  of  systems  of  parallel  arrays  lie 
reasonably  near  a  straight  line. 

9. Determination of the correlation coefficients for certain
physical characters in corn. Corn grown on experimental
plots of the Illinois Station in 1907, 1908, 1909, furnishes
the material for the present study of the correlation between physical
characters in corn, and for the problem of quantitative laws of
inheritance of these characters. It should be understood that the
problem of inheritance is a problem of the correlations between
ancestry and offspring. The characters with which we propose to
deal here are: length, weight, circumference of ears, and the number
of rows of kernels on ears.

While we have done considerable work on the inheritance of
these characters and hope to publish these results, it appears better
to present in the present bulletin the correlations between characters
as we find them in large populations, with very little reference
to heredity. We do this because, with our material, as is very
general in the quantitative study of inheritance in plants, the question
of the precision of the results is complicated by the fewness
of parents relative to the number of offspring, as well as by the
rather stringent selection of parents. We think it expedient to
defer to a later bulletin the treatment of these difficulties. Further,
it is important to know these correlations before attempting a study
of inheritance or of the correlation between characters in parent
ears and yield.

In the comparison of two statistical results, the difference between
the two results compared to its probable error is of great
value. In general, we may take the probable error in a difference
to be the square root of the sum of the squares of the probable
errors of the two results. If the difference does not exceed two or
three times the probable error thus obtained, the difference may
reasonably be attributed to random sampling. If the difference between
the two results is as much as 5 to 10 times the probable error,
the probability of such differences in random sampling is so small
that we are justified in saying that the difference is significant. In
fact, a difference of ten times its probable error is certainly significant
in so far as there is certainty in human affairs.
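The rule for comparing two results can be sketched in Python as follows. The first coefficient, 0.501 ± 0.018, is the one computed above; the second, 0.430 ± 0.020, is a hypothetical value used only to show the arithmetic, and the function names are illustrative.

```python
from math import sqrt

def probable_error_of_difference(error_a, error_b):
    """Square root of the sum of the squares of the two probable errors."""
    return sqrt(error_a ** 2 + error_b ** 2)

def difference_in_probable_errors(a, error_a, b, error_b):
    """How many times its own probable error the difference amounts to."""
    return abs(a - b) / probable_error_of_difference(error_a, error_b)

# 0.501 +/- 0.018 is from the text; 0.430 +/- 0.020 is hypothetical.
ratio = difference_in_probable_errors(0.501, 0.018, 0.430, 0.020)
```

With these figures the ratio falls between two and three, so by the rule above such a difference could reasonably be attributed to random sampling.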

Such significant differences in our applications may perhaps be
well divided into four classes:

(1) Those due to differences in variety of corn.

(2) Those due to seasonal influences.

(3) Those due to difference in soil treatment.

(4) Those to be attributed to selection of parents.

10. Source of material. The material for this study is furnished
by the crops obtained from a number of the regular experiment
plots which are being conducted for different purposes. These
plots may be considered as belonging to two different groups, one
of which is devoted primarily to soil investigation and the other
to experiments in corn breeding.

The soil plots comprise what are designated as the 400 and 500
series. They are devoted to a two-year rotation consisting of corn
alternating with oats; that is to say, corn occupies the 400 series
one year and the 500 series the next year.

Each series is divided into ten plots of one-tenth acre numbered
from 401 to 410 and from 501 to 510. To these plots various
soil treatments have been applied as follows:

401 and 501: None (check plot).
402 and 502: Legume catch-crop and crop residues.*
403 and 503: Farm manure.
404 and 504: Legume and crop residues and lime.
405 and 505: Manure and lime.
406 and 506: Legume and crop residues, lime and phosphorus.
407 and 507: Manure, lime and phosphorus.
408 and 508: Legume and crop residues, lime, phosphorus and potassium.
409 and 509: Manure, lime, phosphorus and potassium.
410 and 510: Legume and crop residues with extra heavy manure and phosphorus.*

*Corn stalks and oat straw plowed under, removing only the grain from
the land.

The  yields  from  these  variously  treated  plots  are  given  in  the 
following  tables  in  connection  with  the  other  data. 

This brief description will serve to explain in a general way
the significance of the various plots and the following data pertaining
to them. The reader who may be interested in a more detailed
account regarding the arrangement, description and history
of these soil plots is referred to Bulletin 125 of this Station,
"Thirty Years of Crop Rotations on the Common Prairie Soils of
Illinois," where are given the complete records.

The particular variety of corn grown upon these plots has been
two strains of Leaming which have been under selection for a
number of years for high-protein and low-protein content respectively,
the work being controlled by the method of "mechanical selection"
described in Bulletins 55, 87 and 100.

To be more definite, in 1907 and 1909, seed corn low in protein
content was planted, while in 1908, seed corn high in protein was
planted on the 400 and 500 series concerned in this investigation.

Although a material difference in composition between the two
strains has been effected through this method of selection, this
difference does not seem to have significantly affected the correlation
values under consideration, as will appear from the results of
this bulletin.

The corn breeding plots from which samples have been taken
for these correlation studies represent four lines of selection which
have been under way since 1896, the object in view being to change
the normal composition of the grain of a variety of corn by producing
strains of special chemical characteristics.

In this manner four strains of markedly different chemical
composition have arisen from a single variety by selecting continuously
for

1. High-protein content
2. Low-protein content
3. High-oil content
4. Low-oil content.

*Five times the ordinary application of manure and of phosphorus.

For  the  history  and  the  results  of  the  first  ten  generations  of 
this  work  the  reader  is  referred  to  Bulletin  128,  "Ten  Generations 
of  Corn  Breeding."  The  variety  under  experiment  contained  orig- 
inally in  1896  an  average  of  10.92  percent  of  protein  and  4.70 
percent  of  oil. 

The  composition  of  the  different  strains  for  the  years  herein 
concerned  is  as  follows : 

          High Protein.   Low Protein.   High Oil.   Low Oil.

1907      13.89           7.32           7.43        2.59
1908      13.94           8.96           7.19        2.39
1909      13.41           7.65           6.96        2.35

II. Discussion of results. — With a set of the four characters
under consideration, there are possible six pairs of variates between
each pair of which we can determine the correlation. From the
results of the tables, it will be observed that we have carried out
the determination of the correlation coefficient for some plots for
each of these six pairs of characters. As the correlation coefficients
we have determined and given in this paper seem sufficient
to  give  a  good  general  notion  as  to  the  value  of  correlations  be- 
tween these  characters,  it  has  appeared  as  well  to  defer  further  cal- 
culations of  correlation  coefficients  until  we  ascertain  whether  fur- 
ther special  determinations  will  be  of  service  in  problems  of  in- 
heritance of  these  characters  and  their  correlations  with  yield, — 
the  problems  to  which  we  regard  the  present  investigation  as  pre- 
liminary. The  accompanying  tables  include  the  results  of  141  de- 
terminations of  correlation. 
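
The count of six pairs follows from taking the four characters two at a time. A minimal Python sketch of that enumeration is given here; the variable names are illustrative only and do not come from the bulletin.

```python
from itertools import combinations

# The four ear characters studied in the bulletin.
characters = ["length", "circumference", "weight", "rows of kernels"]

# Every unordered pair of distinct characters.
pairs = list(combinations(characters, 2))

for first, second in pairs:
    print(f"{first} and {second}")
print(len(pairs), "pairs in all")
```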

Two  year  rotation  corn. — In  length  and  circumference,  the 
correlation  centers  about  0.33  for  the  year  1907,  0.47  for  1908,  and 
0.49  for  1909.  For  the  three  years  together,  the  values  center 
about  0.43.  The  smallest  correlation  is  given  by  data  from  plot 
409  of  1907.  This  correlation  is  0.203.  The  greatest  correlation 
is  0.623  furnished  by  data  of  plot  402  in  1909.  These  extreme 
values  are  very  different  as  seen  by  comparison  with  their  prob- 
able errors.  They  belong  to  different  years,  and  the  value  0.623 
corresponds  to  a  decidedly  low  yield  while  0.203  corresponds  to  a 
high  yield.  In  fact,  there  seems  to  be  a  somewhat  general  ten- 
dency towards  high  correlation  of  length  and  circumference  when 
the yield is low and vice versa. There are three small values of
correlation between length and circumference. These belong to
three consecutive plots of 1907. Inspection of the correlation table
shows  that,  in  these  cases,  there  is  considerable  deviation  from 
linear  regression — the  ears  of  extremely  large  circumference 
tended  to  be  shorter  than  ears  of  less  extreme  circumference. 
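
One way to make the comparison of the two extreme values with their probable errors concrete is sketched below in Python. It assumes that the two plots are independent samples, so that the probable error of the difference is the square root of the sum of the squares of the two probable errors. The function name is hypothetical; the figures are those quoted above for plot 402 of 1909 and plot 409 of 1907.

```python
import math

def difference_in_probable_errors(r1, pe1, r2, pe2):
    """Return |r1 - r2| divided by the probable error of the difference.

    Assumes the two determinations are independent, so the probable
    error of the difference is sqrt(pe1**2 + pe2**2).
    """
    pe_difference = math.hypot(pe1, pe2)
    return abs(r1 - r2) / pe_difference

# Greatest and smallest correlations of length and circumference.
ratio = difference_in_probable_errors(0.623, 0.016, 0.203, 0.021)
print(round(ratio, 1))
```

Under this assumption the difference comes to roughly sixteen times its probable error, which is far more than random sampling would ordinarily produce.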

In  length  and  number  of  rows,  the  correlations  are  insignifi- 
cant except  possibly  in  one  case  where  r  is  more  than  four  times 
its  probable  error. 
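
The rule of comparing r with its probable error can be sketched as follows. The values are the length-and-rows determinations printed in Tables 1 to 3; the dictionary layout and the threshold variable are illustrative assumptions, not part of the bulletin.

```python
# (crop year, plot) -> (r, probable error), length and rows of kernels.
length_rows = {
    (1907, 401): (-0.044, 0.024),
    (1907, 402): (0.007, 0.024),
    (1908, 501): (0.090, 0.025),
    (1908, 502): (0.061, 0.026),
    (1908, 503): (0.120, 0.027),
    (1909, 401): (0.004, 0.027),
    (1909, 402): (0.027, 0.034),
    (1909, 403): (-0.044, 0.026),
}

threshold = 4.0  # "more than four times its probable error"

# Ratio of the size of r to its probable error for each determination.
ratios = {key: abs(r) / pe for key, (r, pe) in length_rows.items()}
exceeding = [key for key, ratio in ratios.items() if ratio > threshold]

for key, ratio in sorted(ratios.items()):
    print(key, round(ratio, 2))
print("exceeding the threshold:", exceeding)
```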

In circumference and rows, the correlation centers about 0.486
in the year 1907, 0.499 in 1908, and 0.467 in 1909. The extremes
presented are 0.425 and 0.608, which do not differ so much as the
extremes for length and circumference. While the deviations are
too great to be assigned to random sampling, it appears that season,
soil, and parentage all differ among the plots.

In length and weight of ears, the mean value of the correlation
is 0.810 in 1909, and these correlations did not show very
great differences for different plots. We may regard 0.8 as a sort
of rough value for this correlation.

In  weight  and  rows  of  kernels,  we  have  values  from  0.178  to 
0.345.  In  weight  and  circumference,  we  have  values  from  0.648 
to  0.840. 

The Illinois Corn. — In length and circumference, the correlations
are very different for selected strains. The conditions are
complicated  by  differences  of  soil,  and  season.  The  low  oil  plot  of 
1907  gives  the  lowest  correlation,  whereas  the  low  oil  plot  of  1908 
gives  the  highest  correlation  of  the  entire  series. 

In correlation between circumference and rows of kernels,
there  are  much  smaller  differences  between  plots  than  for  length 
and  circumference. 

Between  length  and  rows  of  kernels,  there  appears  to  be  no 
significant  correlation,  except  possibly  in  one  case. 

Between length and weight, the correlations are not very different,
and fall roughly between 0.65 and 0.85.

Arranging  the  pairs  of  systems  of  variates  in  descending  order 
as  to  correlation,  we  have  the  following  order : 

(1)  Length  and  weight. 

(2)  Circumference  and  weight. 

(3)  Circumference  and  rows  of  kernels. 

(4)  Length  and  circumference. 

(5)  Weight  and  rows  of  kernels. 

(6)  Length  and  rows  of  kernels. 

For this arrangement the odds are pretty large except in the
case of (3) and (4), and possibly (1) and (2). As a sort of general
conclusion, we may say that the correlations between length and
weight and between circumference and weight are high. The
correlations of circumference with rows of kernels, and of length
with circumference, are considerable. The correlation of weight
with rows of kernels is low, while that of length with rows is
probably insignificant.

These correlations mean that the tendency towards a relative
proportioning of length and circumference, and of circumference
and rows of kernels, is considerably greater than that towards a
relative proportioning of weight and rows of kernels or of length
and rows of kernels.

It  seems  somewhat  disappointing  that  the  correlation  coeffi- 
cients differ  so  widely,  as  this  fact  complicates  the  problem  of  as- 
sessing the  influence  of  the  selection  of  parents  in  a  precise  meas- 
ure of  heredity.  The  wide  difference  for  different  pairs  of  char- 
acters may  be  compared  with  the  correlations  between  different 
pairs of characters of the human body, where it has been found
that between measurements of the long bones of the arms and legs
a high correlation exists, while between different measurements of
the skull a much lower correlation exists.*

TABLE 1. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
400 SERIES. CROP 1907. SEED: LOW PROTEIN BY MECHANICAL SELECTION

Plot   Yield†   Value of r for     Value of r for     Value of r for
                length and         length and         circumference
                circumference      rows               and rows

401    68.4     0.423±0.019        -0.044±0.024       0.501±0.019
402    69.9     0.438±0.019        +0.007±0.024       0.446±0.020
403    69.4     0.312±0.020                           0.484±0.018
404    75.9     0.462±0.018                           0.548±0.017
405    66.6     0.452±0.018                           0.502±0.019
406    84.6     0.403±0.019                           0.470±0.018
407    68.6     0.282±0.021                           0.440±0.020
408    84.1     0.278±0.021                           0.554±0.016
409    71.4     0.203±0.021                           0.487±0.019
410    95.6     0.411±0.017                           0.432±0.018

Plot            Value of r for     Value of r for     Value of r for
                length and         rows and           weight and
                weight             weight             circumference

401             0.781±0.008        0.275±0.023        0.768±0.009
405             0.786±0.009        0.223±0.024        0.721±0.011

*Biometrika, Vol. I, pp. 408-467.
†Computed at 80 lb. per bushel of ear corn.

TABLE 2. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
500 SERIES. CROP 1908. SEED: HIGH PROTEIN BY MECHANICAL SELECTION

Plot   Yield*   Value of r for     Value of r for     Value of r for
                length and         length and         circumference
                circumference      rows               and rows

501    37.5     0.590±0.014        0.090±0.025        0.514±0.019
502    45.0     0.528±0.015        0.061±0.026        0.506±0.019
503    33.1     0.562±0.015        0.120±0.027        0.432±0.022
504    39.3     0.526±0.016                           0.486±0.021
505    39.6     0.519±0.016                           0.608±0.017
506    76.1     0.422±0.015                           0.480±0.016
507    46.1     0.385±0.018                           0.444±0.020
508    74.1     0.360±0.016                           0.517±0.016
509    39.8     0.444±0.017                           0.508±0.019
510    75.0     0.344±0.017                           0.497±0.015

Plot            Value of r for     Value of r for     Value of r for
                length and         rows and           weight and
                weight             weight             circumference

501             0.855±0.006        0.345±0.021        0.771±0.009
505             0.871±0.005        0.348±0.021        0.763±0.009

*Computed at 80 lb. per bushel of ear corn.


TABLE 3. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
400 SERIES. CROP 1909. SEED: LOW PROTEIN BY MECHANICAL SELECTION

Plot   Yield*   Value of r for     Value of r for     Value of r for
                length and         length and         circumference
                circumference      rows               and rows

401    46.2     0.548±0.016        0.004±0.027        0.452±0.021
402    38.6     0.623±0.016        0.027±0.034        0.466±0.026
403    48.4     0.453±0.118        -0.044±0.026       0.514±0.022
404    43.6     0.534±0.016                           0.463±0.022
405    47.2     0.461±0.018                           0.487±0.022
406    57.6     0.506±0.017                           0.482±0.019
407    46.0     0.443±0.019                           0.524±0.020
408    58.8     0.432±0.018                           0.428±0.021
409    51.6     0.409±0.019                           0.458±0.022
410    72.6     0.539±0.012                           0.425±0.018

Plot            Value of r for     Value of r for     Value of r for
                length and         weight and         weight and
                weight             rows               circumference

401             0.818±0.008        0.216±0.025        0.840±0.007
402             0.844±0.008        0.225±0.032        0.746±0.012
403             0.815±0.008        0.178±0.027        0.648±0.013
404             0.801±0.010        0.212±0.029        0.757±0.012
405             0.810±0.008        0.229±0.027        0.728±0.011
406             0.791±0.008
407             0.800±0.008
408             0.798±0.008
409             0.785±0.008
410             0.843±0.005

*Computed at 50 lb. per bushel. Shelled corn (dry substance).


TABLE 4. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
ILLINOIS CORN. CROPS 1907, 1908, 1909

                Value of r for     Value of r for     Value of r for
                length and         circumference      length and
                circumference      and rows           weight

Crop 1907
High protein    0.202±0.031        0.490±0.026        0.822±0.013
Low protein     0.368±0.028        0.520±0.025        0.725±0.016
High oil        0.317±0.029        0.431±0.027        0.838±0.009
Low oil         0.128±0.035        0.562±0.024        0.727±0.017

Crop 1908
High protein    0.310±0.027        0.435±0.025        0.776±0.012
Low protein     0.293±0.027        0.488±0.024        0.809±0.012
High oil        0.132±0.027        0.464±0.022        0.673±0.015
Low oil         0.569±0.020        0.459±0.025        0.868±0.007

Crop 1909
High protein    0.183±0.030        0.592±0.021        0.681±0.016
Low protein     0.437±0.022        0.305±0.029        0.770±0.011
High oil        0.299±0.025        0.335±0.026        0.769±0.013
Low oil         0.255±0.024        0.480±0.020        0.675±0.014

Value of r for length and rows (determined for some strains only; strain not identified here):
Crop 1907: -0.051±0.032, -0.017±0.034
Crop 1908: -0.106±0.030, -0.017±0.031
Crop 1908 or 1909: -0.081±0.035

APPENDIX  ON  THE  MATHEMATICAL  THEORY  OF 
CORRELATION. 

1.  Mathematical    function. — A    variable    y  is  said   to   be   a 
mathematical  function  of  a  variable  x  if  they  are  so  related  that 
to  assigned   values   of   x   there   correspond   definite   values  of  y. 
Thus,  if  y=2x+4,  y  is  a  function  of  x;    since,  for  any  assigned 
value of x, we can compute y. Those who are familiar with analytic
geometry know that a curve is useful for representing and
following  the  variations  of  a  mathematical  function. 

We shall assume, in the present treatment of correlation, a
knowledge*  of  the  use  of  a  system  of  co-ordinate  axes  to  represent 
numbers  and  functions. 

In  order  to  place  the  notion  of  correlation  on  a  precise  basis, 
we  lay  down  the  following  special 

2. Definition.† — Two measurable characters of an individual
or of related individuals are said to be correlated if to a selected
series of sizes of the one there correspond sizes of the other whose
mean values are functions of the selected values. The word "sizes"
is  used  in  the  sense  of  numerical  measure,  and  the  function  is  to 
be  different  from  zero  for  some  of  the  selected  values. 

To  be  concrete,  we  may  think,  for  example,  of  measuring  the 
correlation  between  length  and  circumference  of  ears  of  corn,  or 
the  correlation  of  fathers  and  sons  with  respect  to  stature. 

To  render  the  above  definition  in  symbolic  language  and  to 
develop  the  methods  of  determining  the  function  mentioned  in  the 
definition  are  the  first  points  in  the  application  of  mathematics  to 
the  theory  of  correlation.  For  this  purpose,  let  x  and  y  be  variables 
such  that  y=f(x)  gives  the  mean  value  of  a  system  of  variates 
which  correspond  to  a  selected  x.  Suppose  the  following  system 
of corresponding values results from measurement: $(x', y')$,
$(x'', y'')$, . . . , $(x^{(n)}, y^{(n)})$, where n is a large number indicating
the total number of pairs observed. These observations are
said  to  form  a  total  population  or  universe  of  observations.  As  it 
is  more  convenient  to  deal  with  the  deviations  of  the  observations 


*See  Davenport's   Principles   of   Breeding,   pp.  687,  689. 

†Philosophical Transactions of the Royal Society, Vol. 187 A, pp. 256-257.

from their mean values than with the observations themselves, let
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ represent the deviations of the
observations  from  their  mean  values.  These  deviations  may  be 
conveniently  represented  with  respect  to  co-ordinate  axes  (Fig.  5). 
The  origin  then  represents  the  mean  of  the  two  characters.  In 
fact,  we  may  think  of  the  co-ordinate  axes  as  passing  through  the 
mean  of  the  table  and  drawn  parallel  to  arrays.  The  vertical  par- 
allel lines  of  the  figure  may  then  be  looked  upon  as  separating  the 
observations  into  arrays.  The  values  of  the  y's  which  correspond 
to  a  given  class  mark  x  are  said  to  form  a  y-array.  Suppose  there 
are  s  such  arrays. 


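[FIG. 5. Co-ordinate axes X and Y through the mean of the table, with the means of the y-arrays marked by crosses.]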

Let the crosses (×) in Fig. 5 represent the means of the y's in
each of the y-arrays. If correlation exists, these means do not lie
at random over the field, but arrange themselves more or less in
the form of a smooth curve called the "curve of regression." This
curve is a crude picture of the function which defines the correlation
of the y-character relative to the x-character. Experience has
shown that, in many sets of measurements, this curve is approximately
a straight line. For this reason, and for simplicity, the line
subjected to the condition that the sum of the squares of the deviations
(measured parallel to the y-axis and weighted with the number
of points in the array) of the means from it shall be a minimum, is
called the "line of regression." When the means lie exactly on
the line, the regression is said to be "truly linear."

Let y = mx + b be the function which represents the line of regression,
then the problem of determining the line is that of determining
m and b by means of the above minimal condition. The
algebraic  details  of  subjecting  a  line  to  this  minimal  condition  are 
well  known  to  those  familiar  with  the  method  of  least  squares  or 
the  method  of  moments.  The  equation  of  the  resulting  line  is 

$$y = r\,\frac{\sigma_y}{\sigma_x}\,x, \qquad (1)$$

where $\sigma_x$ is the standard deviation of the population with respect
to the x-character, $\sigma_y$ is the standard deviation with respect to
the y-character, and r is the correlation coefficient given by

$$r = \frac{\sum xy}{n\,\sigma_x\,\sigma_y},$$

where  the  summation  is  extended  to  every  pair  of  corresponding 
variates  of  the  population.  Similarly,  the  regression  of  the  x  char- 
acter on  the  y  character  is  given  by 

$$x = r\,\frac{\sigma_x}{\sigma_y}\,y. \qquad (2)$$

It should be noted that (2) cannot be obtained by solving (1)
for x, for the reason that the correspondence is one between selected
values and means.
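
The remark that the second regression line is not the first solved for x can be checked numerically. The sketch below, in Python, computes r and both regression slopes from deviations about the means. The function name and the five pairs of measurements are hypothetical and serve only as an illustration.

```python
import math

def correlation_and_slopes(xs, ys):
    """Return r, the slope of y on x, and the slope of x on y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations from the mean of x
    dy = [y - mean_y for y in ys]          # deviations from the mean of y
    sigma_x = math.sqrt(sum(d * d for d in dx) / n)
    sigma_y = math.sqrt(sum(d * d for d in dy) / n)
    r = sum(a * b for a, b in zip(dx, dy)) / (n * sigma_x * sigma_y)
    return r, r * sigma_y / sigma_x, r * sigma_x / sigma_y

# Hypothetical lengths and circumferences of five ears (not bulletin data).
lengths = [6.0, 7.0, 8.0, 9.0, 10.0]
circumferences = [5.8, 6.4, 6.1, 6.9, 6.8]

r, slope_y_on_x, slope_x_on_y = correlation_and_slopes(lengths, circumferences)
print(round(r, 4), round(slope_y_on_x, 4), round(slope_x_on_y, 4))
# Solving the first line for x would give the slope 1 / slope_y_on_x instead.
print(round(1 / slope_y_on_x, 4))
```

The product of the two slopes equals r squared, so the two lines coincide only when r is +1 or -1.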

3. Standard deviation of arrays. — Suppose that regression
is truly linear, so that the means of the y-arrays fall on the line

$$y = r\,\frac{\sigma_y}{\sigma_x}\,x ;$$

and, for the present, assume that the standard deviations
of arrays are equal. Then the standard deviation $\sigma_a$ of an
array is given by

$$\sigma_a^2 = \frac{1}{n}\sum\left(y - r\,\frac{\sigma_y}{\sigma_x}\,x\right)^2 ,$$

where the summation extends to the entire population. Expanding the square,

$$\sigma_a^2 = \sigma_y^2 - 2r^2\sigma_y^2 + r^2\sigma_y^2 = \sigma_y^2\,(1 - r^2). \qquad (3)$$

Hence, the standard deviation of a y-array is obtained from the
standard deviation $\sigma_y$ by multiplying $\sigma_y$ by $\sqrt{1-r^2}$.

If the standard deviations of parallel arrays are unequal, then
$\sigma_y\sqrt{1-r^2}$ is simply a sort of average value for the standard
deviation of an array.

Since the first member of (3) is a sum of squares divided by
n, the second member cannot be negative. Hence $-1 \le r \le 1$.

This proves that the correlation coefficient takes values not
greater than +1 nor less than -1.
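
The relation in (3) between the scatter about the line of regression and the factor involving r can be verified on a small example. The following Python sketch uses hypothetical measurements (not taken from the bulletin) and compares the root-mean-square deviation from the line of regression, taken over the whole population, with the standard deviation of y multiplied by the square root of 1 - r².

```python
import math

# Hypothetical paired measurements (not data from the bulletin).
xs = [6.0, 7.0, 8.0, 9.0, 10.0]
ys = [5.8, 6.4, 6.1, 6.9, 6.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
dx = [x - mean_x for x in xs]
dy = [y - mean_y for y in ys]

sigma_x = math.sqrt(sum(d * d for d in dx) / n)
sigma_y = math.sqrt(sum(d * d for d in dy) / n)
r = sum(a * b for a, b in zip(dx, dy)) / (n * sigma_x * sigma_y)

# Deviations of the observations from the line y = r (sigma_y / sigma_x) x.
slope = r * sigma_y / sigma_x
residual_sd = math.sqrt(sum((b - slope * a) ** 2 for a, b in zip(dx, dy)) / n)

predicted_sd = sigma_y * math.sqrt(1.0 - r * r)
print(round(residual_sd, 6), round(predicted_sd, 6))
```

Because the left side is a mean of squares, 1 - r² cannot be negative, which is the argument used above to bound r.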

Equation (3) shows further that if r = +1 all the individual
points plotted from observations must lie on the line of regression,
and we can in this case, when one character is given, tell exactly
the magnitude of the associated character. Further, the ratio

$$\frac{x_1}{y_1} = \frac{x_2}{y_2} = \cdots = \frac{x_n}{y_n} = \text{a positive constant.}$$

Similarly, if r = -1, the individual points plotted all lie on
the line of regression, but

$$\frac{x_1}{y_1} = \frac{x_2}{y_2} = \cdots = \frac{x_n}{y_n} = \text{a negative constant.}$$

4. Correlations among three or more characters. — The theory
of    correlation    can    be   extended    to    apply   to    any    number    of 
variables.     However,  the  complexity  of  the  algebraic  expression 
for  any  number, — say  n-variables — becomes  so  great  that  it  does 
not  seem  well  to  present  a  more  extended  discussion  here,  except 
to  say  that  the  final  result  is  expressed  in  standard  deviations  and 
correlation  between  systems  of  variates  in  sets  of  two,  so  that  the 
problem  is  capable  of  reduction  to  the  one  which  we  have  solved. 

For the general case, the reader with considerable mathematical
training is referred to the treatment by Karl Pearson in the Philosophical
Transactions of the Royal Society, A, 187, 1896, and A,
200, 1903.

5.  Correlation     surfaces. — If    our    frequency     distributions 
follow  normal  probability  curves   (Bulletin  119,  pp.  30-31)  there 
can  be  derived  a  surface 

$$z = f(x, y)$$

such that $f(x, y)\,h\,k$ gives, to within deviations due to random
sampling, the number of the population with corresponding measurements
in the region bounded by x = x, y = y, x = x + h, y = y + k,
where the x and y are deviations from mean values and h and k
are any small numbers. For a considerable range of statistical
data, this surface takes the form

$$z = \frac{n}{2\pi\,\sigma_x\,\sigma_y\sqrt{1-r^2}}\; e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2} - \frac{2rxy}{\sigma_x\sigma_y}\right)},$$

where e is the base of natural logarithms and the other symbols
have been defined above. The symbol $\pi$ equals 3.1416, the ratio of
the circumference to the diameter of a circle.

As the only parameters in this surface (aside from the total
number n) are the standard deviations and the correlation coefficient,
we have in the standard deviations and the correlation coefficient
a perfect description of a normally distributed population.
This fact adds much to the significance of r as a measure of correlation.
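
A short numerical sketch may help in reading the correlation surface. The Python function below evaluates the normal correlation surface in its usual form, with x and y measured as deviations from their means, and then sums f(x, y) h k over a fine grid to confirm that the whole population n is recovered. The function name, the grid, and the parameter values are assumptions chosen for illustration.

```python
import math

def correlation_surface(x, y, n, sigma_x, sigma_y, r):
    """Height z of the normal correlation surface at deviations (x, y)."""
    one_minus_r2 = 1.0 - r * r
    q = (x / sigma_x) ** 2 + (y / sigma_y) ** 2 - 2.0 * r * x * y / (sigma_x * sigma_y)
    scale = n / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_r2))
    return scale * math.exp(-q / (2.0 * one_minus_r2))

# Hypothetical population: 1000 ears, sigma_x = 1.2, sigma_y = 0.5, r = 0.45.
n, sigma_x, sigma_y, r = 1000, 1.2, 0.5, 0.45

# Sum f(x, y) * h * k over small cells covering six standard deviations
# on either side of the mean in each direction.
h = k = 0.05
cells_x = int(round(12 * sigma_x / h))
cells_y = int(round(12 * sigma_y / k))
total = 0.0
for i in range(cells_x):
    x = -6 * sigma_x + (i + 0.5) * h
    for j in range(cells_y):
        y = -6 * sigma_y + (j + 0.5) * k
        total += correlation_surface(x, y, n, sigma_x, sigma_y, r) * h * k

print(round(total, 2))
```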

6. Formulas for the correlation coefficient r which are better
adapted to numerical calculation. — In the first place, the calculation
of the means and standard deviations of both systems of
variates should be done by the shorter method presented on pp.
9-11, Bulletin 119.

It  may  be  well  to  give  that  method  here  in  a  more  symbolic 
form  to  prepare  the  way  for  the  modified  formula  for  r  adapted 
to  calculation. 

Let G represent a guess at the mean M given by

$$M = \frac{f_1 v_1 + f_2 v_2 + \cdots + f_s v_s}{f_1 + f_2 + \cdots + f_s}, \qquad (1)$$

where the v's are class marks and the f's are the corresponding frequencies. Also
let c be the correction to the guess G which gives M. That is

$$M = G + c,$$

$$c = M - G = \frac{f_1 (v_1 - G) + f_2 (v_2 - G) + \cdots + f_s (v_s - G)}{f_1 + f_2 + \cdots + f_s}. \qquad (2)$$

Formula (2) gives the practical method of finding the correction
to be applied to the guess to get the mean.
Next, the standard deviation is given by

$$\sigma^2 = \frac{f_1 (v_1 - M)^2 + f_2 (v_2 - M)^2 + \cdots + f_s (v_s - M)^2}{f_1 + f_2 + \cdots + f_s}$$

$$= \frac{f_1 (v_1 - G - c)^2 + f_2 (v_2 - G - c)^2 + \cdots + f_s (v_s - G - c)^2}{\sum f_t}$$

$$= \frac{\sum f_t (v_t - G)^2 - 2c \sum f_t (v_t - G) + c^2 \sum f_t}{\sum f_t}$$

$$= \frac{\sum f_t (v_t - G)^2 - 2c \sum f_t (v_t - M + c) + c^2 \sum f_t}{\sum f_t}$$

$$= \frac{\sum f_t (v_t - G)^2}{\sum f_t} - c^2. \qquad (3)$$

Formula (3) gives the practical method of calculating the
standard deviation.
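
The short method of formulas (2) and (3) can be sketched in Python as follows. The function name and the frequency distribution are hypothetical; the class marks may be thought of as ear lengths in inches.

```python
import math

def mean_and_sd_short_method(class_marks, frequencies, guess):
    """Mean and standard deviation from a guessed mean G.

    c is the correction to the guess (formula 2); the standard deviation
    is taken about G and then corrected by c squared (formula 3).
    """
    n = sum(frequencies)
    c = sum(f * (v - guess) for v, f in zip(class_marks, frequencies)) / n
    mean = guess + c
    second_moment_about_guess = (
        sum(f * (v - guess) ** 2 for v, f in zip(class_marks, frequencies)) / n
    )
    sd = math.sqrt(second_moment_about_guess - c * c)
    return mean, sd

# Hypothetical frequency distribution (not data from the bulletin).
class_marks = [6, 7, 8, 9, 10]
frequencies = [3, 12, 20, 11, 4]

mean, sd = mean_and_sd_short_method(class_marks, frequencies, guess=8)
print(round(mean, 4), round(sd, 4))
```

Any class mark near the mean may serve as the guess; a different guess changes c but not the resulting mean or standard deviation.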

The value of r is given by

$$r = \frac{\sum xy}{n\,\sigma_x\,\sigma_y},$$

where x and y represent deviations from the means, and the summation
extends to every pair of corresponding variates.

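Let $G_x$ and $G_y$ represent class marks near the means of the systems
of variates indicated by subscripts, and $c_x$, $c_y$ corrections to
these class marks which give the correct mean values so that

$$M_x = G_x + c_x, \qquad M_y = G_y + c_y.$$

Let x', y' be deviations from $G_x$ and $G_y$ which correspond to
deviations x, y from the mean. Then

$$\sum xy = \sum (x' - c_x)(y' - c_y) = \sum x'y' - c_y \sum x' - c_x \sum y' + \sum c_x c_y = \sum x'y' - n\,c_x c_y,$$

since $\sum x' = n\,c_x$ and $\sum y' = n\,c_y$. Hence

$$r = \frac{\dfrac{\sum x'y'}{n} - c_x c_y}{\sigma_x\,\sigma_y}.$$

This is a formula whose computation is shown on pp. 297-301.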
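
As an illustration of the formula adapted to calculation, the Python sketch below computes r from deviations measured from guessed class marks rather than from the true means. The function name, the guessed values, and the five pairs of measurements are hypothetical.

```python
import math

def r_from_guessed_origins(pairs, guess_x, guess_y):
    """Correlation coefficient from deviations about guessed class marks."""
    n = len(pairs)
    xp = [x - guess_x for x, _ in pairs]   # x' = deviation from G_x
    yp = [y - guess_y for _, y in pairs]   # y' = deviation from G_y
    c_x = sum(xp) / n                      # correction giving the mean of x
    c_y = sum(yp) / n                      # correction giving the mean of y
    sigma_x = math.sqrt(sum(v * v for v in xp) / n - c_x * c_x)
    sigma_y = math.sqrt(sum(v * v for v in yp) / n - c_y * c_y)
    product_moment = sum(a * b for a, b in zip(xp, yp)) / n - c_x * c_y
    return product_moment / (sigma_x * sigma_y)

# Hypothetical pairs of (length, circumference), not bulletin data.
pairs = [(6.0, 5.8), (7.0, 6.4), (8.0, 6.1), (9.0, 6.9), (10.0, 6.8)]

r_first_guess = r_from_guessed_origins(pairs, guess_x=8.0, guess_y=6.5)
r_second_guess = r_from_guessed_origins(pairs, guess_x=7.0, guess_y=6.0)
print(round(r_first_guess, 4), round(r_second_guess, 4))
```

The two guesses give the same value of r, which is the point of the method: the arithmetic is done with small whole-number deviations and the corrections remove the effect of the guess.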
The bulletin is a technical presentation of the methods of determining
the correlation coefficient for associated characters, together
with considerable tabular matter giving the correlations
for  various  strains  of  Indian  corn.  It  is  the  purpose  of  this 
abstract  to  present  the  leading  thought  of  the  bulletin  devoid  of 
technical  terms,  and,  omitting  all  reference  to  methods  of  cal- 
culation, to  discuss  briefly  the  meaning  of  the  correlation  coeffi- 
cient, present  the  data  involved,  and  assist  the  non-mathematical 
reader  to  an  understanding  of  its  significance.  Anyone  desir- 
ing to  pursue  the  subject  farther,  particularly  as  to  methods  of 
calculation  of  the  correlation  coefficient,  can  secure  the  complete 
text  upon  request  to  the  Agricultural  Experiment  Station. 

Any  one,  who  has  at  all  considered  the  matter,  is  conscious 
that  there  is  correlation  of  some  characters,  both  in  animals  and 
plants.  That  is  to  say,  that  the  different  characteristics  that  go 
to make up the individual animal or plant do not exist independently
of one another, but on the contrary are more or less co-related,
or bound together by such physiological bonds as compel
them to move more or less with reference to each other.

Among  those  who  have  not  studied  the  matter  carefully,  but 
rely  merely  upon  personal  impressions  derived  from  unsystematic 
observation,  it  appears  that  a  notion  prevails  pretty  commonly 
that  characters  are  either  perfectly  correlated  or  entirely  uncor- 
related,  the  conception  being  that  the  characters  are  either  ab- 
solutely bound  together  or  else  they  move  with  complete  inde- 
pendence. The  truth  is,  however,  that  characters  are  seldom  per- 
fectly correlated,  just  as  they  are  seldom  independent.  The  math- 
ematician follows  accurate  methods  in  determining  precisely  what 
correlations  exist  between  characters  in  large  populations.  He 
has  no  method  of  determining  the  bond  between  two  or  more 
characters  from  a  single  or  even  a  few  individuals.  He  deals 
only with large numbers, and by his methods, he is able to distinguish
very clearly whether, in general, two characters tend to
move  together  or  in  opposition  to  each  other,  and  approximately 
to  what  extent.  If  they  move  together,  correlation  is  said  to  be 
positive.  If  they  move  in  opposite  directions,  the  one  tending  to 
increase  proportionately  as  the  other  decreases,  the  correlation  is 
said  to  be  negative. 

The present bulletin treats the precise methods of measuring
this correlation. It is measured by a single number called the
correlation coefficient, denoted by r, which may take values from
-1 to +1, depending on how fluctuations in the two characters take
place. If, in general, the characters fluctuate together, say either
above or below the type, the value of r lies between 0 and 1
depending upon how closely the characters are correlated. If, in
general, two corresponding characters fluctuate in opposite directions,
the correlation is between 0 and -1. The values r = +1
and r = -1 indicate respectively perfect positive correlation and perfect
negative correlation, while indifference of the characters to
each other's fluctuations leads to the value r = 0 when very large
numbers are used.

The  general  reader  is  not  concerned  with  the  methods  by 
which  these  values  are  obtained.  He  is  concerned  only  with  the 
results,  which  are  significant  and  extremely  valuable,  and  which 
with  a  little  practice  become  easily  apprehended  by  the  non-mathe- 
matical reader.  But  a  single  further  word  of  introduction  is  nec- 
essary, and  that  has  reference  to  the  so-called  probable  error,  a 
decimal  always  following  the  correlation  coefficient,  and  preceded 
by the ± sign. This probable error has no reference to mistakes
which might be made in computation. It has reference to
the  fact  that  any  value  which  may  be  determined  would  probably 
have  been  different  if  a  larger  number  of  individuals  had  been 
involved.  For  example,  if  it  is  desired  to  ascertain  what  is  the 
weight  of  mature  draft  horses  of  a  given  breed,  it  could  be  ob- 
tained approximately  by  weighing  100  such  horses.  It  could  be 
ascertained  with  greater  accuracy  by  weighing  1000  such  horses, 
but there is no absolutely accurate way of determining the actual
weight until every draft horse of that age in the world has been
weighed. The so-called probable error of a result is a number
that  enables  us  to  set  limits  within  which  we  may  reasonably  ex- 
pect the  result  to  be  found  if  we  should  use  larger  numbers  in 
establishing  a  result.  For  a  more  complete  statement  of  the  mean- 
ing and  applicability  of  the  probable  error,  see  Bulletin  119  of 
this  station. 
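
How the probable error shrinks as more individuals are measured can be shown with a small Python sketch. It assumes the classical expression for the probable error of a correlation coefficient, 0.6745 (1 - r²) / √n, which is not stated in this abstract; the function name and the sample sizes are illustrative.

```python
import math

def probable_error_of_r(r, n):
    """Classical probable error of r for a sample of n pairs (assumed formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# The same correlation determined from samples of increasing size.
for n in (100, 1000, 10000):
    print(n, round(probable_error_of_r(0.45, n), 4))
```

Increasing the number of individuals tenfold reduces the probable error by a factor of about 3.16, the square root of ten.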

The  bulletin  treats  the  correlations  among  four  characters  of 
ears  of  corn — length,  circumference,  weight,  and  number  of  rows 
of kernels. The practical bearing of such information as is contained
in the results lies in the facts (1) that in the selection of
parents  for  one  character,  we  should  know  how  this  tends  to  change 
other  characters;  (2)  that  the  problem  of  the  correlation  of  char- 
acters and  yield  requires,  for  its  solution,  in  case  of  a  selected 
parentage,  a  knowledge  of  the  correlation  of  the  characters  among 
themselves in the general population from which parents are selected;
(3) that the problem of inheritance of these characters
requires a knowledge of these correlations.

The  following  tables  give  the  correlation  coefficients  and  prob- 
able error  for  a  large  number  of  determinations  that  have  been 
made  at  the  Agricultural  Experiment  Station,  and  if  the  reader 
will  take  the  pains  to  compare  the  different  correlation  coefficients, 


TABLE 1. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
400 SERIES. CROP 1907. SEED: LOW PROTEIN BY MECHANICAL SELECTION

Plot   Yield*   Value of r for     Value of r for     Value of r for
                length and         length and         circumference
                circumference      rows               and rows

401    68.4     0.423±0.019        -0.044±0.024       0.501±0.019
402    69.9     0.438±0.019        +0.007±0.024       0.446±0.020
403    69.4     0.312±0.020                           0.484±0.018
404    75.9     0.462±0.018                           0.548±0.017
405    66.6     0.452±0.018                           0.502±0.019
406    84.6     0.403±0.019                           0.470±0.018
407    68.6     0.282±0.021                           0.440±0.020
408    84.1     0.278±0.021                           0.554±0.016
409    71.4     0.203±0.021                           0.487±0.019
410    95.6     0.411±0.017                           0.432±0.018

Plot            Value of r for     Value of r for     Value of r for
                length and         rows and           weight and
                weight             weight             circumference

401             0.781±0.008        0.275±0.023        0.768±0.009
405             0.786±0.009        0.223±0.024        0.721±0.011

*Computed at 80 lb. per bushel of ear corn.


TABLE 2. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
500 SERIES. CROP 1908. SEED: HIGH PROTEIN BY MECHANICAL SELECTION

Plot   Yield*   Value of r for     Value of r for     Value of r for
                length and         length and         circumference
                circumference      rows               and rows

501    37.5     0.590±0.014        0.090±0.025        0.514±0.019
502    45.0     0.528±0.015        0.061±0.026        0.506±0.019
503    33.1     0.562±0.015        0.120±0.027        0.432±0.022
504    39.3     0.526±0.016                           0.486±0.021
505    39.6     0.519±0.016                           0.608±0.017
506    76.1     0.422±0.015                           0.480±0.016
507    46.1     0.385±0.018                           0.444±0.020
508    74.1     0.360±0.016                           0.517±0.016
509    39.8     0.444±0.017                           0.508±0.019
510    75.0     0.344±0.017                           0.497±0.015

Plot            Value of r for     Value of r for     Value of r for
                length and         rows and           weight and
                weight             weight             circumference

501             0.855±0.006        0.345±0.021        0.771±0.009
505             0.871±0.005        0.348±0.021        0.763±0.009

*Computed at 80 lb. per bushel of ear corn.


TABLE 3. — CORRELATION AMONG CERTAIN CHARACTERS OF EARS OF CORN
400 SERIES. CROP 1909. SEED: LOW PROTEIN BY MECHANICAL SELECTION

Plot   Yield*   Value of r for     Value of r for     Value of r for
                length and         length and         circumference
                circumference      rows               and rows

401    46.2     0.548±0.016        0.004±0.027        0.452±0.021
402    38.6     0.623±0.016        0.027±0.034        0.466±0.026
403    48.4     0.453±0.118        -0.044±0.026       0.514±0.022
404    43.6     0.534±0.016                           0.463±0.022
405    47.2     0.461±0.018                           0.487±0.022
406    57.6     0.506±0.017                           0.482±0.019
407    46.0     0.443±0.019                           0.524±0.020
408    58.8     0.432±0.018                           0.428±0.021
409    51.6     0.409±0.019                           0.458±0.022
410    72.6     0.539±0.012                           0.425±0.018

Plot            Value of r for     Value of r for     Value of r for
                length and         weight and         weight and
                weight             rows               circumference

401             0.818±0.008        0.216±0.025        0.840±0.007
402             0.844±0.008        0.225±0.032        0.746±0.012
403             0.815±0.008        0.178±0.027        0.648±0.013
404             0.801±0.010        0.212±0.029        0.757±0.012
405             0.810±0.008        0.229±0.027        0.728±0.011
406             0.791±0.008
407             0.800±0.008
408             0.798±0.008
409             0.785±0.008
410             0.843±0.005

*Computed at 50 lb. per bushel. Shelled corn (dry substance).

he  will  learn  something  of  the  way  corn  behaves  under  a  variety 
of  conditions. 

The material for this study is furnished by the crops obtained
from a number of the regular experiment plots that are being
conducted for different purposes. These plots belong to two
different groups, one of which is devoted primarily to soil
investigation (Tables 1-3), and the other to experiments in corn
breeding (Table 4).

The soil plots are designated as the 400 and 500 series. They
are devoted to a two-year rotation consisting of corn alternating
with oats; that is to say, corn occupies the 400 series one year
and the 500 series the next year. Each series is divided into
ten plots, and various soil treatments are applied. For a detailed
account of the arrangement and the soil treatment of these plots,
the reader is referred to the bulletin of which this is an abstract,
or to Bulletin 125 of this station.


TABLE  4. — CORRELATION  AMONG   CERTAIN  CHARACTERS  OF   EARS  OF  CORN 

CORN.     CROPS  1907,  1908,  1909 


               Value of r for   Value of r for   Value of r for
               length and       circumference    length and
               circumference    and rows         weight

Crop 1907
High protein   0.202±0.031      0.490±0.026      0.822±0.013
Low protein    0.368±0.028      0.520±0.025      0.725±0.016
High oil       0.317±0.029      0.431±0.027      0.838±0.009
Low oil        0.128±0.035      0.562±0.024      0.727±0.017
Value of r for length and rows: -0.051±0.032, -0.017±0.034

Crop 1908
High protein   0.310±0.027      0.435±0.025      0.776±0.012
Low protein    0.293±0.027      0.488±0.024      0.809±0.012
High oil       0.132±0.027      0.464±0.022      0.673±0.015
Low oil        0.569±0.020      0.459±0.025      0.868±0.007
Value of r for length and rows: -0.106±0.030, -0.017±0.031

Crop 1909
High protein   0.183±0.030      0.592±0.021      0.681±0.016
Low protein    0.437±0.022      0.305±0.029      0.770±0.011
High oil       0.299±0.025      0.335±0.026      0.769±0.013
Low oil        0.255±0.024      0.480±0.020      0.675±0.014
Value of r for length and rows: -0.081±0.035

The corn grown upon these plots has been of two strains of
Leaming, which have been under selection for high protein and
low protein content respectively. In 1907 and 1909 the seed
corn planted was low in protein content, while in 1908 it was
high in protein.

The corn breeding plots from which the material for Table 4
was taken represent four lines of selection that have been under
way since 1896 for high protein content, low protein content,
high oil content and low oil content.

With a set of four characters under consideration, there are
six pairs of characters, between each pair of which the
correlation can be determined. From the results of the tables, it
will be observed that the correlations for some plots are given
for each of these six pairs of characters. The tables include the
results of 141 determinations of correlation, and should give a
good general notion of the values of these correlations for the
corn under consideration.
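
The count of six pairs, and the total of 141 determinations, can be
checked with a short tally. The sketch below is illustrative only:
the per-table counts are simply the numbers of r values printed in
Tables 1-4, and the variable names are our own, not part of the
bulletin.

```python
from itertools import combinations

# The four ear characters under consideration.
characters = ["length", "circumference", "rows of kernels", "weight"]

# Every unordered pair of distinct characters.
pairs = list(combinations(characters, 2))

# Number of r values printed in each table (counted column by column).
determinations = {
    "Table 1": 10 + 2 + 10 + 3 * 2,      # upper block, then plots 401 and 405
    "Table 2": 10 + 3 + 10 + 3 * 2,      # upper block, then plots 501 and 505
    "Table 3": 10 + 3 + 10 + 3 * 5 + 5,  # upper block, then plots 401-410
    "Table 4": 12 + 12 + 5 + 12,         # four strains, three crops
}
total = sum(determinations.values())

for a, b in pairs:
    print(f"{a} and {b}")
print("pairs:", len(pairs), "determinations:", total)
```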

In Table 1 are given correlations between length and
circumference, length and number of rows, circumference and number
of rows, length and weight, weight and rows of kernels, and weight
and circumference, for some plots of low protein corn, differently
treated, crop of 1907. A careful study of this table shows, first
of all, a considerable tendency for length and circumference to
move together, that is, for long ears to be large in circumference,
but that this correlation varies greatly in the different plots,
ranging all the way from 0.203 to 0.462. Second, there is
practically no correlation between the length of ear and the number
of rows it contains; that is to say, one is no index whatever to
the other. Third, there is a fairly high positive correlation
between circumference and the number of rows, which means that the
large ears have in general more rows than the small ears. However,
the correlation in no case approaches very near to 1.0, which means
that the large ears have not only more rows than the smaller ones,
but larger kernels as well. There is a high correlation for
length-weight and weight-circumference, but a rather low
correlation between weight and rows of kernels.
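
Each entry in the tables is a coefficient followed by a ± figure.
Assuming these figures are probable errors computed by the
classical formula 0.6745(1 - r²)/√n, the sketch below shows how
such a figure depends on r and on the number of ears n. The
function name is our own, and n = 900 is a hypothetical count
chosen only for illustration; the number of ears measured in each
plot is not stated in this part of the bulletin.

```python
import math


def probable_error(r, n):
    """Classical probable error of a correlation coefficient r from n pairs."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# n = 900 is a hypothetical number of ears, used only for illustration.
for r in (0.2, 0.5, 0.8):
    print(f"r = {r:.3f} ± {probable_error(r, 900):.3f}")
```

For a fixed number of ears, the probable error shrinks as r
approaches 1, which agrees with the smaller ± figures attached to
the high length-weight correlations.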

In Table 2 are shown ten plots of high protein corn, differently
fertilized, crop of 1908. The same general traits are maintained
as in the former table, excepting that the correlation runs
somewhat higher between length and circumference, and that there
is perhaps a slight positive correlation between the length of the
ear and the number of rows that it contains.

Table 3 exhibits the correlation for the same series as shown
in Table 1, but for the crop of 1909, and a more complete list
of correlations for length-weight, weight-number of rows, and
weight-circumference. In these later determinations, we find
what we should now expect, namely, a high correlation between
length and weight and between weight and circumference, and a
rather low correlation between weight and the number of rows.

Table 4 exhibits certain correlations for the so-called Illinois
corn, which, as stated above, consists of four strains bred for
chemical composition. A gross comparison within this table will
show that the correlation between any two characters varies in the
same strain of corn in different years, as it does under different
methods of treatment. For example, the correlation between length
and circumference in high protein corn, crop of 1907, was 0.202.
The next year it was 0.310, and the next, 0.183. The reader will
be interested in making this same sort of comparison for other
strains of corn and for other characters.
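
One way to make such a comparison is to express the difference
between two coefficients in units of the probable error of that
difference. The sketch below does this for the high protein
figures just quoted. It assumes that the ± figures in Table 4 are
probable errors and that the crops of different years may be
treated as independent samples, so that the probable errors
combine as the square root of the sum of squares. The function
name is ours.

```python
import math


def difference_in_probable_errors(r1, pe1, r2, pe2):
    """Absolute difference of two coefficients, divided by the probable
    error of the difference (independent samples assumed)."""
    pe_difference = math.hypot(pe1, pe2)  # sqrt(pe1**2 + pe2**2)
    return abs(r1 - r2) / pe_difference


# Length and circumference, high protein strain, from Table 4.
high_protein = {1907: (0.202, 0.031), 1908: (0.310, 0.027), 1909: (0.183, 0.030)}

ratio_1907_1908 = difference_in_probable_errors(*high_protein[1907], *high_protein[1908])
ratio_1908_1909 = difference_in_probable_errors(*high_protein[1908], *high_protein[1909])

print(f"1907 vs 1908: {ratio_1907_1908:.2f} probable errors")
print(f"1908 vs 1909: {ratio_1908_1909:.2f} probable errors")
```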

Arranging the pairs of characters in descending order as to
correlation, we have the following order:

(1) Length and weight.

(2) Circumference and weight.

(3) Circumference and rows of kernels.

(4) Length and circumference.

(5) Weight and rows of kernels.

(6) Length and rows of kernels.


For this arrangement, the odds are pretty large except in the
case of (3) and (4), and possibly of (1) and (2).
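
The arrangement can be illustrated from a single plot for which
all six correlations are given. The sketch below sorts the six
values for plot 401 in Table 1 (crop of 1907). It is an
illustration only: the order stated above rests on the tables as a
whole, not on this one plot, and the dictionary and its keys are
our own naming.

```python
# Values of r for plot 401, Table 1 (probable errors omitted).
plot_401 = {
    "length and weight": 0.781,
    "circumference and weight": 0.768,
    "circumference and rows of kernels": 0.501,
    "length and circumference": 0.423,
    "weight and rows of kernels": 0.275,
    "length and rows of kernels": -0.044,
}

# Pairs arranged in descending order of correlation.
ranked = sorted(plot_401, key=plot_401.get, reverse=True)

for position, pair in enumerate(ranked, start=1):
    print(f"({position}) {pair}: {plot_401[pair]:+.3f}")
```

In this plot the first two values differ by only 0.013, which is
in keeping with the remark that the odds are less decided for (1)
and (2).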

As a sort of general conclusion, we may say that correlations
for length-weight and circumference-weight are high. The
correlations for circumference-rows of kernels and
length-circumference are fairly high. The correlation of
weight-rows of kernels is low, while that of length-rows of
kernels is probably, in general, insignificant.